Restructuring Lemmas in a Dictionary of Serbian
Authors
Abstract
Traditionally produced lexical resources for Serbo-Croatian are not suitable for the automatic processing of contemporary Serbian. In particular, the processes of structural derivation, although very productive in Serbian, are not presented systematically in either monolingual or bilingual dictionaries. The morphological e-dictionary of Serbian was initially built on the basis of these traditional resources and therefore reproduces the same flaws. Two solutions are possible. The first is to add to the e-dictionary all the lemmas produced by structural derivation, whether or not they are recorded in traditional dictionaries and confirmed in the corpus of contemporary Serbian, and then to assemble them explicitly into a complex lemma by means of an appropriate lexical graph. This approach, however, overproduces lemmas and requires such a graph to be constructed for each complex lemma. The second is to extrapolate the missing lemmas using morphological grammars that model specific morphological processes and are applied only to entries already in the dictionary, a condition that is checked by lexical constraints. It is demonstrated that such morphological grammars enable a precise classification of the processes of structural derivation, similar to the classification of inflectional phenomena.
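The second approach can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not describe their grammar formalism. The toy dictionary, the `DerivationRule` structure, the `analyze` function, and the two possessive-adjective rules are all assumptions introduced for illustration. The sketch shows only the core idea: a derived form that is missing from the dictionary is accepted when stripping a derivational suffix yields a base that is already a dictionary entry with the required part of speech (the lexical constraint).

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical toy dictionary: lemma -> part of speech (N = noun, V = verb).
DICTIONARY = {"brat": "N", "sestra": "N", "pisati": "V"}


@dataclass(frozen=True)
class DerivationRule:
    """One illustrative derivational process (not taken from the paper)."""
    name: str
    suffix: str       # suffix carried by the derived lemma
    base_ending: str  # ending restored on the base once the suffix is stripped
    base_pos: str     # lexical constraint: required part of speech of the base
    derived_pos: str  # part of speech assigned to the derived lemma


# Assumed example rules: possessive adjectives in -ov and -in.
RULES = [
    DerivationRule("possessive -ov", "ov", "", "N", "A"),
    DerivationRule("possessive -in", "in", "a", "N", "A"),
]


def analyze(word: str) -> Optional[Tuple[str, str, Optional[str]]]:
    """Return (lemma, pos, derivation note), or None if nothing licenses the word."""
    if word in DICTIONARY:
        return (word, DICTIONARY[word], None)
    for rule in RULES:
        if word.endswith(rule.suffix) and len(word) > len(rule.suffix):
            base = word[: -len(rule.suffix)] + rule.base_ending
            # The rule fires only if the base is already a dictionary entry
            # with the required part of speech.
            if DICTIONARY.get(base) == rule.base_pos:
                return (word, rule.derived_pos, f"{rule.name} from {base}")
    return None


print(analyze("bratov"))   # derived from the noun "brat"
print(analyze("sestrin"))  # derived from the noun "sestra"
print(analyze("stolov"))   # rejected: "stol" is not in the toy dictionary
```

Because the rules fire only on bases already in the dictionary, no derived lemma has to be stored in advance, which is the contrast with the first approach drawn in the abstract.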
Similar Resources
An Effective Method for Developing a Comprehensive Morphological E-dictionary of Compounds
In this paper we present the process of creating a comprehensive morphological dictionary of compounds for Serbian. This dictionary should be compatible with existing large morphological dictionaries of simple words for Serbian. Due to the complexity of Serbian morphology, the production of a dictionary of compounds is not an easy task. In this paper we present a procedure that automatically pr...
Derivational Morphology in an E-Dictionary of Serbian
In this paper we explore the relation between derivational morphology and synonymy in connection with an electronic dictionary, inspired by the work of Maurice Gross. The characteristics of this relation are illustrated by derivation in Serbian, which produces new lemmas with predictable meaning. We call this regular derivation. We then demonstrate how this kind of derivation is handled in text...
Regular Derivation and Synonymy in an E-Dictionary of Serbian, by Duško Vitas and Cvetana Krstev
In this paper we explore the relation between derivational morphology and synonymy in connection with an electronic dictionary, inspired by the work of Maurice Gross. The characteristics of this relation are illustrated by derivation in Serbian, which produces new lemmas with predictable meaning. We call this regular derivation. We then demonstrate how this kind of derivation is handled in text...
Acquisition of domain-specific multiword expressions in Serbian
Introduction: The term “multiword expression” (MWE) denotes linguistic expressions composed of two or more words functioning as a single unit at the semantic level (Calzolari et al., 2002). Their processing is a major challenge for computer science, due to their non-compositionality (the meaning of the expression cannot be determined from the meanings of its components), as one of the main characteri...
Composite Tense Recognition and Tagging in Serbian
Finite-state transducer technology is used to recognize, lemmatize, and tag composite tenses in Serbian in a way that connects the auxiliary and the main verb. The suggested approach uses a morphological electronic dictionary of simple words and appropriate local grammars.